Abstract. The results of several papers concerning the Černý conjecture
are deduced as consequences of a simple idea that I call the averaging
trick. This idea is implicitly used in the literature, but no attempt
was made to formalize the proof scheme axiomatically. Instead, authors
axiomatized classes of automata to which it applies.



1 Introduction 

Recall that a (complete deterministic) automaton $\mathscr{A} = (Q, \Sigma)$
with state set $Q$ and alphabet $\Sigma$ is called synchronizing if there
is a word $w \in \Sigma^*$ such that $|Qw| = 1$. The word $w$ is called a
synchronizing word. The main conjecture in this area is:

Conjecture 1 (Černý [1]). An $n$-state synchronizing automaton admits a
synchronizing word of length at most $(n-1)^2$.

There is a vast literature on this subject. See for example [1–25]. The
best known upper bound is cubic [26], whereas it is known that one cannot
do better than $(n-1)^2$ [1].
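The lower bound can be observed experimentally on small cases. The following is a minimal sketch, assuming the standard description of the Černý automaton (a cyclic shift `a` together with a letter `b` that moves one state and fixes the rest); the function names are illustrative and do not come from the literature. It finds a shortest synchronizing word by breadth-first search on subsets of $Q$.

```python
from collections import deque

def cerny_automaton(n):
    """Transition tables delta[letter][state] of the n-state Cerny automaton."""
    a = [(q + 1) % n for q in range(n)]               # cyclic shift
    b = [0 if q == n - 1 else q for q in range(n)]    # sends n-1 to 0, fixes the rest
    return {"a": a, "b": b}

def shortest_sync_length(delta, n):
    """Length of a shortest synchronizing word, or None if there is none."""
    start = frozenset(range(n))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        if len(subset) == 1:          # first singleton reached is at minimal distance
            return dist[subset]
        for table in delta.values():
            image = frozenset(table[q] for q in subset)
            if image not in dist:
                dist[image] = dist[subset] + 1
                queue.append(image)
    return None

# Compare the shortest length with (n - 1)^2 for a few small n.
for n in range(2, 7):
    print(n, shortest_sync_length(cerny_automaton(n), n), (n - 1) ** 2)
```

The search is exponential in $n$, so it is only meant for small sanity checks.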

My goal in this note is not to prove the Černý conjecture for a new
class of automata, but rather to give a no-frills, uniform approach to
an argument that underlies a growing number of results in the Černý
conjecture literature (cf. [7,13,20–22,24]). Underlying all these results
(as well as the more difficult results of [5] and [17]) are two simple
ideas:

- if a finite sequence of numbers is not constant, then it must at some
  place exceed its average;

- finite dimensional vector spaces satisfy the ascending chain condition
  on subspaces.

The latter idea is often cloaked in the language of rational power series.

The paper is organized as follows. In the next section I state what I
call the "Averaging Lemma." It is a method, with a probabilistic flavor,
for obtaining bounds on lengths of synchronizing words. Before proving
the lemma, I show how to deduce from it Kari's solution of the Černý
conjecture for Eulerian automata, as well as recent results of Béal and
Perrin [20] for one-cluster automata and Carpi and d'Alessandro [21,22]
for (locally) strongly transitive automata. We also recover an old result
of Rystsov [7] on regular automata (which is essentially the same thing
as strongly transitive automata). In fact, we obtain new generalizations
of all these results. The final section proves the Averaging Lemma.

2 The averaging trick 

Let $\Sigma$ be an alphabet. Denote by $\Sigma^*$ the free monoid on
$\Sigma$ and put

\[ \Sigma^{\le d} = \{w \in \Sigma^* \mid |w| \le d\}. \]

The ring of polynomials with real coefficients in the non-commuting
variables $\Sigma$ is denoted $\mathbb{R}\Sigma$. By a (finitely
supported) probability on $\Sigma^*$, we mean an element

\[ P = \sum_{w \in \Sigma^*} P(w)\, w \in \mathbb{R}\Sigma \]

such that $P(w) \ge 0$ for all $w \in \Sigma^*$, and

\[ \sum_{w \in \Sigma^*} P(w) = 1. \]

The support of $P$ is

\[ \sigma(P) = \{w \in \Sigma^* \mid P(w) > 0\}. \]

Notice that if $P_1$ and $P_2$ are probabilities, then so is $P_1P_2$.
Also note that $\sigma(P_1P_2) = \sigma(P_1)\sigma(P_2)$.
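As a small illustration of these definitions, here is a sketch that stores a finitely supported probability as a Python dictionary from words (strings) to exact rational weights. The representation and the names `product`, `support` and `is_probability` are assumptions made for this example only.

```python
from collections import defaultdict
from fractions import Fraction

def product(p1, p2):
    """Product in the ring of non-commutative polynomials: words concatenate."""
    out = defaultdict(Fraction)          # missing entries default to 0
    for u, pu in p1.items():
        for v, pv in p2.items():
            out[u + v] += pu * pv
    return dict(out)

def support(p):
    """Set of words with strictly positive weight."""
    return {w for w, weight in p.items() if weight > 0}

def is_probability(p):
    """Non-negative weights that sum to 1."""
    return all(weight >= 0 for weight in p.values()) and sum(p.values()) == 1

# Two hypothetical probabilities over the alphabet {a, b}; "" is the empty word.
p1 = {"": Fraction(1, 2), "a": Fraction(1, 2)}
p2 = {"b": Fraction(1, 3), "ab": Fraction(2, 3)}
p = product(p1, p2)
print(sorted(p.items()))
```

Here the word `ab` arises in two ways (as `"" + "ab"` and as `"a" + "b"`), so its weights are added; the support of the product is still the set of concatenations of the two supports.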

If $X \colon \Sigma^* \to \mathbb{R}$ is a random variable, then the
expected value of $X$ (with respect to the probability $P$) is:

\[ E_P(X) = \sum_{w \in \Sigma^*} P(w)X(w) = \sum_{w \in \sigma(P)} P(w)X(w). \tag{1} \]

The fundamental property of a random variable that we exploit in this
paper is that either it is almost surely constant (and equal to its
expectation), or with positive probability it exceeds its expectation.
More precisely, it is immediate from (1) and the definition of a
probability that either $X(w) = E_P(X)$ for all $w \in \sigma(P)$, or
there is a value $w \in \sigma(P)$ with $X(w) > E_P(X)$.
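The dichotomy just described can be checked mechanically on small examples. The sketch below assumes the same dictionary representation of a probability as words mapped to rational weights; the helper names are hypothetical.

```python
from fractions import Fraction

def expectation(p, x):
    """E_P(X): sum of P(w) * X(w) over the support of P."""
    return sum(weight * x(w) for w, weight in p.items() if weight > 0)

def exceeds_or_constant(p, x):
    """Return a support word w with X(w) > E_P(X), or None if there is none.

    By the averaging principle, None can only happen when X is constant
    on the support of P.
    """
    mean = expectation(p, x)
    for w, weight in p.items():
        if weight > 0 and x(w) > mean:
            return w
    return None

# Hypothetical example: X is the word length, P is uniform on three words.
p = {"": Fraction(1, 3), "a": Fraction(1, 3), "ab": Fraction(1, 3)}
print(expectation(p, len), exceeds_or_constant(p, len))
```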

Suppose now that $\mathscr{A} = (Q, \Sigma)$ is an automaton with
$|Q| = n$. We view elements of $\mathbb{R}^Q$ as row vectors. Let
$\pi \colon \mathbb{R}\Sigma \to M_n(\mathbb{R})$ be the corresponding
matrix representation (cf. [27]); so if

\[ f = \sum_{w \in \Sigma^*} f(w)\, w \]

and $q, r \in Q$, then

\[ \pi(f)_{q,r} = \sum_{qw = r} f(w). \]

We shall usually omit $\pi$ from the notation and view $\mathbb{R}\Sigma$
as acting on row and column vectors. If $S \subseteq Q$, then $[S]$
denotes the characteristic row vector of $S$; e.g., $[Q]$ is the all ones
row vector. We use $[S]^T$ to denote the transpose vector. A key fact is
that $w[S]^T = [Sw^{-1}]^T$ for $w \in \Sigma^*$, where as usual
$Sw^{-1} = \{q \in Q \mid qw \in S\}$.
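The key fact can be verified directly on a toy example. The following sketch uses a hypothetical 3-state automaton (it is not taken from the paper) and plain Python lists for matrices and vectors.

```python
def run(delta, q, word):
    """State reached from q after reading word (right action of words on states)."""
    for letter in word:
        q = delta[letter][q]
    return q

def word_matrix(delta, n, word):
    """0/1 matrix pi(word): entry (q, r) is 1 exactly when q.word = r."""
    m = [[0] * n for _ in range(n)]
    for q in range(n):
        m[q][run(delta, q, word)] = 1
    return m

def char_vector(subset, n):
    """Characteristic vector [S] of a subset S of {0, ..., n-1}."""
    return [1 if q in subset else 0 for q in range(n)]

def mat_times_column(m, col):
    return [sum(m[q][r] * col[r] for r in range(len(col))) for q in range(len(m))]

def preimage(delta, n, subset, word):
    """S w^{-1} = {q : q.word in S}."""
    return {q for q in range(n) if run(delta, q, word) in subset}

# Hypothetical automaton, used only for illustration.
delta = {"a": [1, 2, 0], "b": [0, 1, 0]}
n, S, w = 3, {0}, "ab"
lhs = mat_times_column(word_matrix(delta, n, w), char_vector(S, n))   # w [S]^T
rhs = char_vector(preimage(delta, n, S, w), n)                        # [S w^{-1}]^T
print(lhs, rhs)
```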

Lemma 2 (Averaging Lemma). Let $\mathscr{A} = (Q, \Sigma)$ be a
synchronizing automaton with $n$ states, let $P_1$ be a probability on
$\Sigma^*$ and let $R \subseteq Q$. Set $c = 2$ if, for each proper
non-empty subset $S \subsetneq R$, there exist $w_1, w_2 \in \sigma(P_1)$
with $Sw_1^{-1} \neq Sw_2^{-1}$, and otherwise put $c = 1$. Suppose that
there exists a probability $P_2$ with support $\Sigma^{\le n-c}$ such that:

1. $[R]P_2P_1 = [R]$;

2. $R \subseteq q\Sigma^*$ for all $q \in R$;

3. there exists $w_0 \in \Sigma^*$ with $Qw_0 \subseteq R$.

Then $\mathscr{A}$ has a synchronizing word of length at most:

- $c + (n-2)(n-c+L)$ if $R = Q$;

- $(r-1)(n-c+L) + \ell + c - 1$ if $R \subsetneq Q$,

where $r = |R|$, $L$ is the maximum length of a word in $\sigma(P_1)$ and
$\ell = |w_0|$.

Remark 3. If $r$ is odd, then the proof shows that the bounds in Lemma 2
can be improved to $1 + (n-2)(n-c+L)$ and $(r-1)(n-c+L) + \ell$,
respectively.
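For concreteness, the bounds of Lemma 2 and Remark 3 can be packaged as a small function. This is only a sketch for experimenting with the parameters; the function name and the flag `use_odd_r_remark` are hypothetical.

```python
def averaging_bound(n, r, c, L, ell=0, use_odd_r_remark=False):
    """Upper bound on a synchronizing word from Lemma 2.

    n: number of states, r = |R|, c in {1, 2}, L: longest word in the support
    of P_1, ell = |w_0| (ignored when R = Q). If use_odd_r_remark is True and
    r is odd, the improvement of Remark 3 is applied.
    """
    if not (1 <= r <= n and c in (1, 2)):
        raise ValueError("expected 1 <= r <= n and c in {1, 2}")
    step = n - c + L
    extra = 0 if (use_odd_r_remark and r % 2 == 1) else c - 1
    if r == n:                                # case R = Q
        return 1 + extra + (n - 2) * step
    return (r - 1) * step + ell + extra       # case R a proper subset of Q

# c = 1, L = 0 and R = Q gives 1 + (n - 2)(n - 1).
print(averaging_bound(5, 5, 1, 0))
```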
Before proving the lemma, let us use it to derive anew some results
from the literature. The first is a result of Kari on synchronizing
Eulerian automata [13]. An automaton is Eulerian if its underlying graph
admits a closed Eulerian directed path, or equivalently, it is strongly
connected and the in-degree of every vertex is the same as the out-degree
(and hence is the alphabet size). Actually, we can generalize his result.

Let us say that a strongly connected automaton $\mathscr{A} = (Q, \Sigma)$
is pseudo-Eulerian if we can find a probability $P$ with support $\Sigma$
such that the matrix $\pi(P)$ is doubly stochastic (i.e., each row and
column of $\pi(P)$ adds up to 1). For instance, if $\mathscr{A}$ is
Eulerian with adjacency matrix $A$ and $d = |\Sigma|$, then we can set

\[ P = \sum_{a \in \Sigma} d^{-1} a. \]

One checks that $\pi(P) = d^{-1}A$, and hence is doubly stochastic by the
Eulerian hypothesis. Thus every Eulerian automaton is pseudo-Eulerian.
It is easy to check whether a strongly connected automaton is
pseudo-Eulerian: one just needs to look for a strictly positive solution
$(p_a)_{a \in \Sigma}$ to the system of $|Q| + 1$ linear equations

\[ 1 = \sum_{a \in \Sigma} p_a, \qquad 1 = \sum_{a \in \Sigma} p_a \cdot |qa^{-1}| \quad (q \in Q). \]

Fig. 1. A pseudo-Eulerian automaton

The automaton in Figure 1 is pseudo-Eulerian but not Eulerian. Indeed,
if we put $P = a/2 + b/6 + c/3$, then $\pi(P)$ is doubly stochastic.
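To see how such a verification goes in practice, here is a sketch that computes $\pi(P)$ for a probability supported on letters and tests double stochasticity with exact arithmetic. The 3-state automaton and the weights below are a hypothetical example constructed for this illustration; they are not the automaton of Figure 1.

```python
from fractions import Fraction

def pi_of_P(delta, n, P):
    """Matrix pi(P) = sum over letters a of P(a) * pi(a)."""
    m = [[Fraction(0)] * n for _ in range(n)]
    for letter, weight in P.items():
        for q in range(n):
            m[q][delta[letter][q]] += weight
    return m

def is_doubly_stochastic(m):
    n = len(m)
    rows_ok = all(sum(row) == 1 for row in m)
    cols_ok = all(sum(m[q][r] for q in range(n)) == 1 for r in range(n))
    return rows_ok and cols_ok

def in_degrees(delta, n):
    """Number of edges entering each state in the underlying graph."""
    return [sum(table[q] == r for table in delta.values() for q in range(n))
            for r in range(n)]

# Hypothetical automaton: strongly connected (via a), in-degrees 3, 4, 5.
delta = {"a": [1, 2, 0], "b": [0, 1, 0], "c": [2, 1, 2], "d": [2, 2, 1]}
P = {"a": Fraction(1, 4), "b": Fraction(3, 8), "c": Fraction(1, 4), "d": Fraction(1, 8)}
uniform = {letter: Fraction(1, 4) for letter in delta}

print(in_degrees(delta, 3))                            # unequal, so not Eulerian
print(is_doubly_stochastic(pi_of_P(delta, 3, P)))
print(is_doubly_stochastic(pi_of_P(delta, 3, uniform)))
```

The uniform weights fail because the in-degrees differ, while the non-uniform weights balance the columns, which is exactly the gap between Eulerian and pseudo-Eulerian.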



Theorem 4. An $n$-state synchronizing pseudo-Eulerian automaton has
a synchronizing word of length at most $1 + (n-2)(n-1)$.

Proof. Let $\mathscr{A} = (Q, \Sigma)$ and suppose that $P$ is a
probability with support $\Sigma$ such that $\pi(P)$ is doubly stochastic.
Let $P_1$ be the probability with support concentrated on the empty word
and take $R = Q$. As pseudo-Eulerian automata are strongly connected,
$Q \subseteq q\Sigma^*$ for all $q \in Q$. Put

\[ P_2 = \frac{1}{n} \sum_{m=0}^{n-1} P^m; \]

it is a probability with support $\Sigma^{\le n-1}$. The condition that
$\pi(P)$ is doubly stochastic is equivalent to $[Q]P = [Q]$. Thus

\[ [Q]P_2P_1 = \frac{1}{n} \sum_{m=0}^{n-1} [Q]P^m = [Q]. \]

Since $\sigma(P_1)$ consists of the empty word only, we have $c = 1$ and
$L = 0$. The Averaging Lemma now yields the upper bound of
$1 + (n-2)(n-1)$ on the length of a synchronizing word. □

The next result simultaneously generalizes results of Rystsov [7] on
regular automata, Béal [24] on circular automata, Béal, Berlinkov and
Perrin [20,28] on one-cluster automata and Carpi and d'Alessandro [21,22]
on strongly and locally strongly transitive automata.

Theorem 5. Let $\mathscr{A} = (Q, \Sigma)$ be a synchronizing automaton.
Suppose there is a set of words $W \subseteq \Sigma^*$ and $k \ge 1$ so
that, for each state $q \in Q$ and each state $s \in R = QW$, there are
exactly $k$ elements of $W$ taking $q$ to $s$. Let $\ell$ be the length of
the shortest word in $W$ and $L$ be the length of the longest. If $R = Q$,
then there is a synchronizing word for $\mathscr{A}$ of length at most
$2 + (n-2)(n-2+L)$; if $R \subsetneq Q$, then there is a synchronizing
word of length at most $(r-1)(n-2+L) + \ell + 1$ where $r = |R|$.

Proof. A straightforward counting argument establishes that $|W| = kr$.
It remains to define our probabilities in order to apply the Averaging
Lemma. Take $P_1$ to be the uniform distribution on $W$ (so
$P_1(w) = 1/|W|$ for $w \in W$ and is otherwise 0). To verify that
$c = 2$, let $\emptyset \neq S \subsetneq R$ and suppose that $s \in S$
and $q \in R \setminus S$. Then by the hypothesis on $W$, there exist
$w_1, w_2 \in W$ with $qw_1 = s$ and $qw_2 = q$. Then $q \in Sw_1^{-1}$
but $q \notin Sw_2^{-1}$.
Now let $P_2$ be an arbitrary probability with support $\Sigma^{\le n-c}$.
The only condition remaining to check in order to apply the Averaging
Lemma is that $[R]P_2P_1 = [R]$. First observe that the columns of
$\pi(P_1)$ corresponding to elements of $Q \setminus R$ are zero, while if
$s \in R$, then the corresponding column of $\pi(P_1)$ is
$(k/|W|)[Q]^T = (1/r)[Q]^T$. Since $\pi(P_2)$ is a stochastic matrix (each
of its rows sums to 1), this means that $\pi(P_2P_1) = \pi(P_1)$.
Next observe that if $s \in R$, then $s \sum_{w \in W} w = k[R]$. Thus

\[ [R] \sum_{w \in W} w = \sum_{s \in R} s \sum_{w \in W} w = rk[R] = |W|[R]. \]

Therefore, $[R]P_1 = [R]$ and hence $[R]P_2P_1 = [R]$, as required. □

For example, Béal and Perrin [20] call $\mathscr{A} = (Q, \Sigma)$ a
one-cluster automaton if there exists $a \in \Sigma$ so that $a$ has only
one cycle $R$ on $Q$; see Figure 2. Suppose that the cycle has size $r$.
Then each state of $Q$ is taken to each element of $R$ by exactly one word
of the set $W = \{a^{n-r}, \ldots, a^{n-1}\}$. Theorem 5 then yields the
bound of $2n^2 - 7n + 8$.

Fig. 2. $a$-skeleton of a one-cluster automaton with $n = 15$ and $r = 5$.

This should be compared with the bound of $2n^2 - 7n + 7$ from [28], which
improves on the earlier bound of $2n^2 - 6n + 5$ from [20]. Indeed, if
$r = n$, Theorem 5 immediately yields a bound of
$2 + (n-2)(2n-3) = 2n^2 - 7n + 8$. Otherwise, using $L = n-1$ and
$\ell = n-r$, we obtain a bound of

\[ \begin{aligned}
(r-1)(2n-3) + n - r + 1 &= r(2n-4) - n + 4 \\
&\le (n-1)(2n-4) - n + 4 \\
&= 2n^2 - 7n + 8.
\end{aligned} \]
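Both the hypothesis of Theorem 5 for this choice of $W$ and the arithmetic of the bound can be sanity-checked numerically. The sketch below uses a hypothetical $a$-skeleton with $n = 6$ and $r = 3$ (not the one drawn in Figure 2); the function names are illustrative.

```python
def power_image(a, q, m):
    """State reached from q after reading the letter a exactly m times."""
    for _ in range(m):
        q = a[q]
    return q

def check_one_cluster(a, r):
    """Check that W = {a^(n-r), ..., a^(n-1)} takes each state to each
    cycle state exactly once (the hypothesis of Theorem 5 with k = 1)."""
    n = len(a)
    R = {power_image(a, q, n) for q in range(n)}     # states on the cycle
    if len(R) != r:
        return False
    for q in range(n):
        hits = sorted(power_image(a, q, m) for m in range(n - r, n))
        if hits != sorted(R):
            return False
    return True

def theorem5_bound(n, r, L, ell):
    if r == n:
        return 2 + (n - 2) * (n - 2 + L)
    return (r - 1) * (n - 2 + L) + ell + 1

# Hypothetical skeleton: cycle 0 -> 1 -> 2 -> 0 with a tail 5 -> 4 -> 3 -> 0.
a = [1, 2, 0, 0, 3, 4]
print(check_one_cluster(a, 3), theorem5_bound(6, 3, 5, 3))
```

Looping `theorem5_bound(n, r, n - 1, n - r)` over small values of $n$ and $r$ gives a quick check that the result never exceeds $2n^2 - 7n + 8$.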
Similarly, one recovers the results of Rystsov [7] and the results of
Carpi and d'Alessandro [21,22] with an improved bound. Indeed, the locally
strongly transitive automata of [22] constitute the special case of
Theorem 5 where $k = 1$. Rystsov's notion of a regular automaton is
essentially the same as (but slightly more rigid than) the case $R = Q$.

The proof of Theorem 5 can easily be adapted to obtain the same bound if
$W$ is an arbitrary set of words such that there is a probability $P_1$
supported on $W$ so that each column of $\pi(P_1)$ corresponding to an
element of $Q \setminus R$ is 0, whereas each column corresponding to an
element of $R$ is $(1/r)[Q]^T$.

3 Proof of the Averaging Lemma 

The proof of the Averaging Lemma rests on our observation about
expectations of random variables and the ascending chain condition for
finite dimensional vector spaces. Suppose that $\Sigma^*$ acts on the left
of a vector space $V$ by linear maps. Let $X \subseteq \Sigma^*$ and let
$W$ be a subspace. Then by $XW$, we mean the span of all vectors $xw$ with
$x \in X$ and $w \in W$.

Lemma 6. Let $\pi \colon \Sigma^* \to M_n(K)$ be a matrix representation
with $K$ a field. Suppose that one has subspaces $W, V \subseteq K^n$ of
column vectors with $W \subseteq V$, but $\Sigma^*W \nsubseteq V$. Let $S$
be a spanning set for $W$. Then there exist $s \in S$ and $w \in \Sigma^*$
with $|w| \le \dim V - \dim W + 1$ and $ws \notin V$.

Proof. Put $W_m = \Sigma^{\le m}W$. Then there is an ascending chain of
subspaces

\[ W = W_0 \subseteq W_1 \subseteq W_2 \subseteq \cdots \]

and moreover as soon as this chain stabilizes it equals $\Sigma^*W$. By
our assumption, there is a greatest $m \ge 0$ with $W_m \subseteq V$. In
particular, the chain does not stabilize until after $m$ steps and so

\[ W_0 \subsetneq W_1 \subsetneq \cdots \subsetneq W_m \subseteq V \]

and hence $\dim W_0 + m \le \dim V$, that is,
$m + 1 \le \dim V - \dim W + 1$. Therefore, there is a word
$w \in \Sigma^*$ with $|w| \le \dim V - \dim W + 1$ and
$wW \nsubseteq V$. But $W$ is spanned by $S$, so we can find $s \in S$
with $ws \notin V$. □
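The ascending chain in this proof can be computed explicitly over the rationals. The following is a sketch under stated assumptions: it uses a hand-written Gaussian elimination, a hypothetical 3-state automaton, and the starting vector $[S]^T - (|S|/n)[Q]^T$ with $S = \{0\}$; none of the names come from the paper.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors over the rationals (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def apply(matrix, vec):
    """Left action of a matrix on a column vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def chain_dimensions(letter_matrices, spanning_set):
    """dim W_0, dim W_1, ... for W_m = Sigma^{<=m} W, until the chain stabilizes."""
    vectors = list(spanning_set)
    dims = [rank(vectors)]
    while True:
        # W_{m+1} is spanned by W_m together with a.W_m for every letter a.
        new_vectors = vectors + [apply(m, v) for m in letter_matrices for v in vectors]
        d = rank(new_vectors)
        if d == dims[-1]:
            return dims
        dims.append(d)
        vectors = new_vectors   # grows quickly; acceptable for tiny examples only

# Hypothetical automaton and starting vector, for illustration only.
delta = {"a": [1, 2, 0], "b": [0, 1, 0]}
matrices = [[[1 if delta[x][q] == r else 0 for r in range(3)] for q in range(3)]
            for x in "ab"]
gamma = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]
dims = chain_dimensions(matrices, [gamma])
print(dims)
```

The dimensions strictly increase until stabilization, so the number of strict steps is at most $n - \dim W$, which is the counting used in the proof.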

Proof (of the Averaging Lemma). For convenience, put $X = \sigma(P_1)$. We
show that for each $\emptyset \neq S \subsetneq R$, there exists
$w \in \Sigma^*$ of length at most $n - c + L$ with
$|Sw^{-1} \cap R| > |S|$ except for when $c = 2$ and $|S| = r/2$, in
which case we can only guarantee that $w$ has length at most $n - 1 + L$.
If $R = Q$, the result is then immediate: one can find a state $q \in Q$
and a letter $a \in \Sigma$ so that $|qa^{-1}| > 1$; now we expand by
inverse images $n - 2$ times with words of length at most $n - c + L$
(except for when $c = 2$ and $|S| = r/2$, in which case we expand by
$n - 1 + L$) to obtain the result. If $R \subsetneq Q$, we can find $w$ of
length at most $(r-1)(n-c+L) + c - 1$ with $|Rw| = 1$ using the same
idea. Then as $Qw_0 \subseteq R$, it follows that
$|Qw_0w| \le |Rw| = 1$. This yields the bound of
$(r-1)(n-c+L) + \ell + c - 1$ on the length of a synchronizing word.

Consider the probability $P = P_2P_1$ on $\Sigma^*$ and define a random
variable $Z_S \colon \Sigma^* \to \mathbb{R}$ by

\[ Z_S(w) = |Sw^{-1} \cap R| = [R][Sw^{-1}]^T = [R]w[S]^T. \]

Let us compute the expected value of this random variable:

\[ \begin{aligned}
E_P(Z_S) &= \sum_{w \in \Sigma^*} P(w)|Sw^{-1} \cap R| = \sum_{w \in \Sigma^*} P(w)[R]w[S]^T \\
&= [R]P[S]^T = [R]P_2P_1[S]^T = [R][S]^T \\
&= |S|
\end{aligned} \]

where we have used $[R]P_2P_1 = [R]$. The support of $P$ is
$\sigma(P_2)\sigma(P_1) = \Sigma^{\le n-c}X$. If we can find
$v \in \Sigma^{\le n-c}X$ with $Z_S(v) = |Sv^{-1} \cap R| \neq |S|$,
then we can find $w \in \Sigma^{\le n-c}X$ with
$|Sw^{-1} \cap R| = Z_S(w) > |S|$ by our discussion earlier on random
variables that are not almost surely constant. As $|w| \le n - c + L$,
this will finish the proof.
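The computation of $E_P(Z_S)$ can be replayed on a concrete case. The sketch below makes several assumptions for illustration: the automaton is the 4-state Černý automaton, $R = Q$, $P_1$ is uniform on $X = \{a^0, \ldots, a^{3}\}$ (so $c = 2$ and $L = 3$), and $P_2$ is taken uniform on $\Sigma^{\le n-c}$. It confirms that the mean equals $|S|$ and lists the words of the support on which $Z_S$ exceeds it.

```python
from fractions import Fraction
from itertools import product

def run(delta, q, word):
    """State reached from q after reading word."""
    for letter in word:
        q = delta[letter][q]
    return q

def Z(delta, S, R, word):
    """Z_S(word) = |S word^{-1} intersected with R|."""
    return sum(1 for q in R if run(delta, q, word) in S)

n = 4
delta = {"a": [(q + 1) % n for q in range(n)],
         "b": [0 if q == n - 1 else q for q in range(n)]}
R = set(range(n))
X = ["a" * i for i in range(n)]                       # support of P_1
short = ["".join(t) for m in range(n - 2 + 1)         # Sigma^{<= n-c} with c = 2
         for t in product("ab", repeat=m)]

# P = P_2 P_1; the same word may arise from several pairs (u, x).
P = {}
for u in short:
    for x in X:
        P[u + x] = P.get(u + x, 0) + Fraction(1, len(short) * len(X))

S = {0}
mean = sum(weight * Z(delta, S, R, w) for w, weight in P.items())
better = [w for w in P if Z(delta, S, R, w) > mean]
print(mean, better)
```

Every word found this way has length at most $n - c + L = 5$, matching the bound stated above.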

If $|Sx^{-1} \cap R| \neq |S|$ for some $x \in X$, then we are done.
Otherwise, we may assume $|Sx^{-1} \cap R| = |S|$ for all $x \in X$. Let
$\gamma$ be the column vector $[S]^T - (|S|/r)[Q]^T$. Notice that if
$w \in \Sigma^*$, then one has
$w\gamma = [Sw^{-1}]^T - (|S|/r)[Q]^T$ and so
$[R]w\gamma = |Sw^{-1} \cap R| - |S|$. In particular, if $x \in X$ our
assumption implies $[R]x\gamma = 0$. Moreover, $X\gamma \neq 0$ as
$|S| < r$. Thus if $W$ is the subspace spanned by the column vectors
$x\gamma$ with $x \in X$, then $0 \neq W \subseteq [R]^{\perp}$.

Our next goal is to verify that $\dim W \ge c$ unless $c = 2$ and
$|S| = r/2$ (in which case it is at least 1). The only non-trivial case is
when $c = 2$ and $|S| \neq r/2$. Then we can find $w_1, w_2 \in X$ with
$Sw_1^{-1} \neq Sw_2^{-1}$. We claim that $w_1\gamma$ and $w_2\gamma$ are
linearly independent elements of $W$. Indeed, if they were linearly
dependent, then since both vectors are non-zero we must have
$w_1\gamma = kw_2\gamma$ for some $k \in \mathbb{R}$. Moreover, $k \neq 1$
because $Sw_1^{-1} \neq Sw_2^{-1}$. Thus
$[Sw_1^{-1}]^T - k[Sw_2^{-1}]^T = (|S|/r)(1-k)[Q]^T$. Since $[Q]^T$ is the
all ones column vector and $[Sw_1^{-1}]^T$, $[Sw_2^{-1}]^T$ are column
vectors of zeroes and ones, it follows that $k = -1$ and $Sw_1^{-1}$,
$Sw_2^{-1}$ are complementary subsets of $Q$. Then we obtain
$[Q]^T = (2|S|/r)[Q]^T$, whence $|S| = r/2$, a contradiction. We conclude
that $w_1\gamma$ and $w_2\gamma$ are linearly independent and so
$\dim W \ge 2 = c$.

Our next claim is that $\Sigma^*W \nsubseteq [R]^{\perp}$. Indeed, let $w$
be a synchronizing word. Then $ww_0$ synchronizes $\mathscr{A}$ to an
element $q \in R$. But $q\Sigma^* \supseteq R$, so we can synchronize to
any state of $R$. In particular, we can synchronize $\mathscr{A}$ via some
word $y$ into $Sx^{-1} \cap R$ for some $x \in X$. Then
$Sx^{-1}y^{-1} = Q$ and so
$[R]yx\gamma = |Sx^{-1}y^{-1} \cap R| - |S| > 0$. This shows that
$yx\gamma \notin [R]^{\perp}$ and hence
$\Sigma^*W \nsubseteq [R]^{\perp}$. As $\dim W \ge c$ and
$\dim [R]^{\perp} = n - 1$, Lemma 6 now provides
$u \in \Sigma^{\le n-c}$ and $x \in X$ with
$ux\gamma \notin [R]^{\perp}$. Putting $v = ux \in \Sigma^{\le n-c}X$, we
have $0 \neq [R]v\gamma = |Sv^{-1} \cap R| - |S|$. This completes the
proof. □

Remark 7. The above proof and the proof of the main result of [28] give
an improved bound for one-cluster automata. It is shown in [28] that if we
have an $n$-state one-cluster automaton with unique $a$-cycle $R$ of
length $r$, then we can find a state $q \in R$ and a word $w$ of length at
most $2n - r - 1$ such that $|qw^{-1} \cap R| > 1$. Since the Černý
conjecture is proved for the case $r = n$ [5], we may assume
$r \le n - 1$. Combining this with the above proof yields a bound of

\[ \begin{aligned}
(r-2)(2n-3) + 2n - r - 1 + n - r + 1 &= (r-2)(2n-3) + 3n - 2r \\
&= r(2n-5) - n + 6 \\
&\le (n-1)(2n-5) - n + 6 \\
&= 2n^2 - 8n + 11.
\end{aligned} \]